Contextual Sentence Decomposition
Abstract
In this thesis, we introduce and study contextual sentence decomposition, which, intuitively, decomposes a given sentence into parts that semantically “belong together”. For example, a valid decomposition of the sentence “Usable parts of rhubarb include the edible stalks and the medicinally used roots, however its leaves are toxic” consists of the sub-sentences “Usable parts of rhubarb include the edible stalks”, “Usable parts of rhubarb include the medicinally used roots” and “however its leaves are toxic”. Our motivation for this problem comes from semantic full-text search. For a query plant edible leaves, semantic full-text search returns passages where instances of a plant, such as “rhubarb” (and not the word “plant”), are mentioned along with the words “edible” and “leaves”. One of the results this query might erroneously return is the original sentence above. With contextual sentence decomposition we avoid this false positive while preserving the factual content of the original sentence. We propose two approaches to this problem, one based on a set of rules and one using machine learning. On a manually assembled ground truth, we achieve an F-measure of about 65 percent for the former and 40 percent for the latter. For semantic full-text search based on these approaches, evaluated on the English Wikipedia (27 GB of raw text), we nearly double the F-measure for some queries.
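The false positive described above can be illustrated with a small sketch. It is not the thesis's system: the decomposition is written out by hand from the example sub-sentences, the entity match of “plant” to “rhubarb” is replaced by the literal word “rhubarb”, and the helper names (`words`, `matches`, `matches_any_context`) are hypothetical.

```python
# Toy illustration only: the decomposition is hand-written, not computed,
# and the word "rhubarb" stands in for the entity match of "plant".
SENTENCE = ("Usable parts of rhubarb include the edible stalks and the "
            "medicinally used roots, however its leaves are toxic")

# Sub-sentences taken from the example in the abstract.
CONTEXTS = [
    "Usable parts of rhubarb include the edible stalks",
    "Usable parts of rhubarb include the medicinally used roots",
    "however its leaves are toxic",
]


def words(text):
    """Lowercase the text and split it into a set of words, dropping commas."""
    return set(text.lower().replace(",", " ").split())


def matches(text, query_words):
    """Return True if every query word occurs somewhere in the text."""
    return set(query_words) <= words(text)


def matches_any_context(contexts, query_words):
    """Return True if a single context contains all query words."""
    return any(matches(context, query_words) for context in contexts)


query = ["rhubarb", "edible", "leaves"]
print(matches(SENTENCE, query))              # whole sentence: True (false positive)
print(matches_any_context(CONTEXTS, query))  # per context: False
```

Under the same assumptions, the query `["rhubarb", "edible", "stalks"]` still matches the first context, so the true fact about edible stalks remains findable.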
Similar resources
Open Information Extraction via Contextual Sentence Decomposition
We show how contextual sentence decomposition (CSD), a technique originally developed for high-precision semantic search, can be used for open information extraction (OIE). Intuitively, CSD decomposes a sentence into the parts that semantically “belong together”. By identifying the (implicit or explicit) verb in each such part, we obtain facts like in OIE. We compare our system, called CSD-IE, ...
Modelling pause duration as a function of contextual length
Effects of contextual length are known to affect pause durations in neutral speech. The present study investigates these effects on an expressive corpus of read tales in French. Computational models of intra-sentence, and inter-sentence pause durations, as functions of contextual lengths are proposed. These models are aimed at improving Text-To-Speech synthesis systems, and provide clues for sy...
An effective sentence-extraction technique using contextual information and statistical approaches for text summarization
This paper proposes an effective method to extract salient sentences using contextual information and statistical approaches for Text Summarization. The proposed method combines two consecutive sentences into a bi-gram pseudo sentence so that contextual information is applied to statistical sentence-extraction techniques. Salient bigram pseudo sentences are first selected by the statistical sen...
Using determiners as contextual cues in sentence comprehension: A comparison between younger and older adults
Younger adults use both semantic and phonological cues to quickly and efficiently localize the referent during sentence comprehension. While some behavioral studies suggest that older adults use contextual information even more strongly than younger adults, ERP studies have shown that this population, as a group, is less apt at using contextual semantic cues to predict upcoming words. The curre...
Homographic Ideogram Understanding Using Contextual Dynamic Network
Conventional methods for disambiguation problems have been using statistical methods with co-occurrence of words in their contexts. It seems that human-beings assign appropriate word senses to the given ambiguous word in the sentence depending on the words which followed the ambiguous word when they could not disambiguate by using the previous contextual information. In this research, Contextua...